Redundancy Reduction in 3D Facial Motion Capture Data for Animation
Daniela I. Wellein ∗† Cristóbal Curio ‡ Heinrich H. Bülthoff §
Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Figure 1: Example animation for facial motion analysis with the complete (left) and the selected (middle) set of markers. Best marker set in terms of low error and low number of markers (right).

Research on the perception of dynamic faces often requires real-time animations with low latency. With an adaptation of principal feature analysis [Cohen et al. 2002], we can reduce the number of facial motion capture markers by 50% while retaining the overall animation quality.

1 Facial Animation System
Curio et al. [2006] proposed a performance-driven facial animation system. The semantics of an actor's movement are transferred from motion capture recordings to a corresponding model of 3D scans. This correspondence is established with parallel blendshapes. A 3D head model is animated by morphing a linear combination of the 3D scanned blendshapes. The weights for the linear combination are determined beforehand by facial motion analysis. The analysis consists of removing rigid head motion from the motion capture data, constructing the motion capture blendshape basis, and finding the linear combination of blendshapes with minimal least-squares error with respect to the recorded motion capture marker positions.

2 Feature Selection Algorithm
Principal Feature Analysis (PFA) is an unsupervised feature selection method based on Principal Component Analysis (PCA). It aims at selecting features with minimal mutual correlation. When PFA is applied to marker selection, one 3D marker consists of three features. The algorithm consists of the following steps.

First, the input features are PCA transformed. Retaining 98% of the variance of the data leads to a reduced matrix of eigenvectors. Each dimension of the eigenvectors denotes a PCA-transformed feature.

Second, the PCA-transformed features are clustered via k-Means. Highly correlated features are grouped together in the space of the PCA-transformed features and will therefore form a cluster. The cluster means correspond to the most uncorrelated features. The results of the clustering depend on the randomized initialization of k-Means. To overcome this, k-Means is run 1000 times per value of k. The parameter k is varied to find the clusters that best reflect the structure of the feature set.

Third, the selected sets of features are converted to sets of markers. A symmetrical selection of markers on the left and right half of the face is enforced: if at least one feature belonging to a symmetrical marker pair is in the feature set, the marker pair is included.

Finally, the most stable marker selection out of the 1000 runs of k-Means is identified. To this end, the selection frequency of each marker pair is determined. We consider the marker selection with the highest sum of marker pair frequencies to be the most stable one.

3 Results
The different marker sets have been evaluated on motion capture data obtained from three different actors (ME, MB and JZ). For these data, facial motion analysis (Section 1) is performed on both the selected and the complete marker set. The error measure is the mean squared error (MSE) between the blendshape weights estimated with the two marker sets. In Figure 2 the MSE is plotted against the number of selected markers. The curve is not monotonic, because the individual clustering results are independent of each other. A local minimum can be observed at 35 markers, and it is present for all actors (Figure 2). The animation example for a frame with a high MSE shows that facial motion is correctly analyzed with this selected set of markers (Figure 1, right and middle). The selected set of markers (Figure 1, right) is optimal in terms of low number of markers and low error.

Figure 2: Error curves for selected marker sets vs. the number of markers. The best marker set is marked with a black circle.
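The feature selection steps of Section 2 can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not stated in the text: features are stored as consecutive x, y, z triplets per marker (so feature f belongs to marker f // 3), PCA is computed from the covariance matrix, k-Means is a plain Lloyd iteration, and the feature chosen for each cluster is the one closest to the cluster mean. All function names and the array `pair_of` (mapping each marker to the index of its symmetrical pair) are hypothetical.

```python
import numpy as np

def pca_feature_vectors(X, variance=0.98):
    """Return one row per input feature: its coordinates in the retained eigenvectors."""
    Xc = X - X.mean(axis=0)                      # X is frames x features
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    q = min(int(np.searchsorted(ratio, variance)) + 1, len(eigvals))
    return eigvecs[:, :q]                        # features x q

def kmeans(V, k, rng, n_iter=50):
    """Plain Lloyd k-Means with random initialization (assumed variant)."""
    centers = V[rng.choice(len(V), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(V[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):              # keep old center if cluster is empty
                centers[j] = V[labels == j].mean(axis=0)
    return centers

def select_features(V, k, rng):
    """Pick, for each cluster, the feature closest to the cluster mean."""
    centers = kmeans(V, k, rng)
    dist = np.linalg.norm(V[:, None, :] - centers[None, :, :], axis=2)
    return np.unique(dist.argmin(axis=0))

def most_stable_pairs(X, pair_of, k, n_runs=1000, seed=0):
    """Repeat k-Means and return the marker-pair set with the highest frequency sum."""
    rng = np.random.default_rng(seed)
    V = pca_feature_vectors(X)
    runs = []
    for _ in range(n_runs):
        feats = select_features(V, k, rng)
        # A pair is included if any feature of either of its markers was selected.
        runs.append(np.unique(pair_of[feats // 3]))
    freq = np.zeros(pair_of.max() + 1)
    for pairs in runs:
        freq[pairs] += 1                         # selection frequency per marker pair
    scores = [freq[pairs].sum() for pairs in runs]
    return runs[int(np.argmax(scores))]
```

Calling `most_stable_pairs(X, pair_of, k)` for several values of k, with `X` holding one row per frame, would yield one candidate marker set per k; these candidates could then be compared by the weight error described in Section 3.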
∗ currently working at ICCAS, University of Leipzig
† e-mail: daniela.wellein@iccas.de
‡ e-mail: cristobal.curio@tuebingen.mpg.de (contact author)
§ e-mail: heinrich.buelthoff@tuebingen.mpg.de
References
COHEN, I., TIAN, Q., ZHOU, X., AND HUANG, T. 2002. Feature selection using principal feature analysis. Univ. of Illinois at Urbana-Champaign, 2002.
CURIO, C., BREIDT, M., KLEINER, M., VUONG, Q. C., GIESE, M. A., AND BÜLTHOFF, H. H. 2006. Semantic 3D motion retargeting for facial animation. In Proc. of Applied Perception in Graphics and Visualization 2006, 77–84.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org. APGV 2007, Tübingen, Germany, July 26–27, 2007. © 2007 ACM 978-1-59593-670-7/07/0007 $5.00